25 research outputs found

    Towards More Accurate and Explainable Supervised Learning-Based Prediction of Deliverability for Underground Natural Gas Storage

    Numerous subsurface factors, including geology and fluid properties, can affect the connectivity of the storage spaces in depleted reservoirs; hence, fluid flow simulations become more complicated, and predicting their deliverability remains challenging. This paper applies Machine Learning (ML) techniques to predict the deliverability of underground natural gas storage (UNGS) in depleted reservoirs. First, three baseline models were developed based on Support Vector Regression (SVR), Artificial Neural Network (ANN), and Random Forest (RF) algorithms. To improve the accuracy of the RF model as the best-performing baseline model, a unified framework, referred to as SARF, was developed. SARF combines the capabilities of Sparse Autoencoder (SA) and that of Random Forest (RF). To achieve this, the internal representations of the SA, which constitute extracted features of the input variables, are used in RF to develop the proposed SARF framework. The predictive capabilities of the baseline models and the proposed SARF model were validated using 3744 real-world storage data samples of 52 active storage reservoirs in the United States. The experimental result of this study shows that the proposed SARF model achieved an average 5.7% increase in accuracy on four separate data partitions over the baseline RF model. Furthermore, a set of eXplainable Artificial Intelligence (XAI) methods were developed to provide an intuitive explanation of which factors influence the deliverability of reservoir storage. The visualizations developed using the XAI method provide an easy-to-understand interpretation of how the SARF model predicted the deliverability values for separate reservoirs

    Negation and Speculation in NLP: A Survey, Corpora, Methods, and Applications

    Negation and speculation are universal linguistic phenomena that affect the performance of Natural Language Processing (NLP) applications, such as those for opinion mining and information retrieval, especially in biomedical data. In this article, we review the corpora annotated with negation and speculation in various natural languages and domains. Furthermore, we discuss the ongoing research into recent rule-based, supervised, and transfer learning techniques for the detection of negating and speculative content. Many English corpora for various domains are now annotated with negation and speculation; moreover, the availability of annotated corpora in other languages has started to increase. However, this growth is insufficient to address these important phenomena in languages with limited resources. The use of cross-lingual models and translation of the well-known languages are acceptable alternatives. We also highlight the lack of consistent annotation guidelines and the shortcomings of the existing techniques, and suggest alternatives that may speed up progress in this research direction. Adding more syntactic features may alleviate the limitations of the existing techniques, such as cue ambiguity and detecting the discontinuous scopes. In some NLP applications, inclusion of a system that is negation- and speculation-aware improves performance, yet this aspect is still not addressed or considered an essential step

    Towards Building a Speech Recognition System for Quranic Recitations: A Pilot Study Involving Female Reciters

    This paper is the first step in an effort toward building an automatic speech recognition (ASR) system for Quranic recitations that caters specifically to female reciters. To function properly, ASR systems require a huge amount of training data. Surprisingly, the data readily available for Quranic recitations suffer from major limitations. Specifically, the currently available audio recordings of Quran recitations are massive in volume, but most were made by male reciters (who have dedicated most of their lives to perfecting their recitation skills) using professional and expensive equipment. Such proficiency in the training data (along with the fact that the reciters come from a specific demographic group: adult males) will most likely bias the resulting models and limit their ability to process input from other groups, such as non-/semi-professionals, females, or children. This work aims to explore this shortcoming empirically. To do so, we create a first-of-its-kind (to the best of our knowledge) benchmark dataset called the Quran recitations by females and males (QRFAM) dataset. QRFAM is a relatively large dataset of audio recordings made by male and female reciters from different age groups and proficiency levels. After creating the dataset, we experiment on it by building ASR systems based on one of the most popular open-source ASR models, Mozilla's DeepSpeech. The speaker-independent end-to-end models that we produce are evaluated using word error rate (WER). Despite DeepSpeech's known flexibility and prowess (shown when it is trained and tested on recitations from the same group), the models trained on the recitations of one group could not recognize most of the recitations by the other groups in the testing phase. This shows that there is still a long way to go to produce an ASR system that can be used by anyone, and the first step is to build and expand the resources needed for this, such as QRFAM. Hopefully, our work will be the first step in this direction and will inspire the community to take more interest in this problem

    Building a neural speech recognizer for quranic recitations

    This work is an effort towards building a Neural Speech Recognizer (NSR) system for Quranic recitations that can be used effectively by anyone regardless of gender and age. Although many recitations are available online, most are recorded by professional adult male reciters, which means that an ASR system trained on such datasets would not work for female or child reciters. We address this gap by adopting a benchmark dataset of audio recordings of Quranic recitations by both genders and different ages. Using this dataset, we build several speaker-independent NSR systems based on the DeepSpeech model and evaluate them using word error rate (WER). The goal is to show how an NSR system trained and tuned on a dataset from one gender performs on a test set from the other gender. Unfortunately, the number of female recitations in our dataset is rather small, while the number of male recitations is much larger. In the first set of experiments, we avoid the imbalance between the two genders by down-sampling the male part to match the female part. For this small subset of our dataset, the results are interesting: the system trained on male recitations gives 0.968 WER when tested on female recitations and 0.406 WER when tested on male recitations. Conversely, the system trained on female recitations gives 0.966 WER when tested on male recitations and 0.608 WER when tested on female recitations
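
    The WER figures quoted above follow the usual definition: the word-level edit distance (substitutions, deletions, and insertions) between the reference transcript and the recognizer output, divided by the number of reference words. The following is a minimal sketch of that standard metric, not the authors' evaluation code; the function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: words are separated by whitespace. Real evaluations may
    normalize text (e.g. diacritics, punctuation) before scoring.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words matches an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)
```

    Because insertions count as errors, WER can exceed 1.0, so values near 0.97 indicate that almost every reference word was misrecognized.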

    Applications in Home Improvement Retailer, Koctas

    Koçtaş is a leader in the home improvement sector in Turkey and is focused on providing the best possible service and customer experience. The company is also working to accelerate its digital investments and to use its large volume of customer data to innovate in the industry. One use of this data is the collection and AI-based analysis of video camera images, which makes it possible to detect people and identify which products and shelves are viewed most in the stores. This information can then be used to optimize store layout and product placement for a better customer experience. Another innovation is the deployment of kiosks that use Natural Language Processing (NLP) to interact with customers. These kiosks can understand and answer customers' questions using AI, providing a more personalized and human-like experience. Finally, Koçtaş uses Dynamic Creative Optimization to create personalized advertisements. This method optimizes the content and format of ads based on the individual preferences and behavior of customers, leading to more effective marketing. Overall, Koçtaş uses technology and data to drive innovation and provide a better customer experience in the home improvement industry

    Improving news headline text generation quality through frequent POS-tag patterns analysis

    Writing original synthetic content is one of the human abilities that algorithms aspire to emulate. Sophisticated algorithms, especially those based on neural networks, have shown promising results in recent times. A watershed moment came with the introduction of the attention mechanism, which paved the way for transformers, an exciting new architecture in natural language processing. Recent sensations such as GPT and BERT rely on transformers for synthetic text generation. Although GPT- and BERT-based models can generate creative text when properly trained on abundant data, the quality of the generated text suffers when limited data is available. This is especially an issue for low-resource languages, where labeled data is still scarce. In such cases, the generated text more often than not lacks proper sentence structure and is thus unreadable. This study proposes a post-processing step that improves the quality of text generated by the GPT model. The proposed step is based on an analysis of POS-tag patterns in the original text and accepts only those GPT-generated sentences that satisfy POS patterns learned from the data. We use the GPT model to generate English headlines from the Australian Broadcasting Corporation (ABC) news dataset. Furthermore, to assess the applicability of the model to low-resource languages, we also train it on an Urdu news dataset to generate Urdu news headlines. The experiments on these datasets from high- and low-resource languages show that the quality of the generated headlines improves significantly with the proposed headline POS-pattern extraction. We evaluate performance through subjective evaluation as well as text generation quality metrics such as BLEU and ROUGE
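
    The post-processing idea described above can be sketched as a filter over POS-tag sequences. This is an illustration under stated assumptions, not the paper's implementation: the abstract does not say how patterns are matched, so the sketch assumes a pattern is the full tag sequence of a headline, that "frequent" means occurring at least `min_count` times in the original headlines, and that sentences are already POS-tagged as (word, tag) pairs. All function names are hypothetical.

```python
from collections import Counter
from typing import List, Sequence, Set, Tuple

# A tagged sentence is a sequence of (word, POS tag) pairs.
TaggedSentence = Sequence[Tuple[str, str]]
Pattern = Tuple[str, ...]


def tag_pattern(sentence: TaggedSentence) -> Pattern:
    """Reduce a tagged sentence to its sequence of POS tags."""
    return tuple(tag for _, tag in sentence)


def frequent_patterns(corpus: Sequence[TaggedSentence],
                      min_count: int = 2) -> Set[Pattern]:
    """Collect tag sequences seen at least min_count times in the corpus.

    Assumption: min_count is an illustrative threshold, not a value
    taken from the paper.
    """
    counts = Counter(tag_pattern(sentence) for sentence in corpus)
    return {pattern for pattern, count in counts.items() if count >= min_count}


def filter_generated(candidates: Sequence[TaggedSentence],
                     allowed: Set[Pattern]) -> List[TaggedSentence]:
    """Keep only generated sentences whose tag sequence is an allowed pattern."""
    return [s for s in candidates if tag_pattern(s) in allowed]
```

    In practice the tagged input would come from a POS tagger for the target language; exact whole-sequence matching is the strictest choice, and a looser match (for example on tag n-grams) would accept more generated headlines.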

    Software Defect Prediction Using Artificial Neural Networks: A Systematic Literature Review

    The demand for automated online software systems is increasing day by day, which has triggered the need for high-quality, maintainable software at lower cost. Software defect prediction is one of the crucial tasks of the quality assurance process; it improves quality at lower cost by reducing overall testing and maintenance effort. Early detection of defects in the software development life cycle (SDLC) leads to early corrections and ultimately to timely delivery of maintainable software, which satisfies customers and builds their confidence in the development team. In the last decade, many machine learning-based approaches to software defect prediction have been proposed to achieve higher accuracy. The Artificial Neural Network (ANN) is one of the most widely used machine learning techniques and is included in most of the proposed defect prediction frameworks and models. This research provides a critical analysis of the latest literature, published from 2015 to 2018, on the use of Artificial Neural Networks for software defect prediction. A systematic research process is followed to extract the literature from three widely used digital libraries (IEEE, Elsevier, and Springer), and after a thorough selection process, the 8 most relevant research publications are chosen for critical review. This study will serve researchers by exploring current trends in software defect prediction with a focus on ANNs and will also provide a baseline for future innovations, comparisons, and reviews

    Heart Disease Prediction Using Machine Learning Method

    Heart disease, also known as coronary artery disease, involves many symptoms affecting the heart that are very common nowadays and can cause death. Diagnosing heart disease without an intelligent diagnostic system is a challenging task. Many researchers have studied the problem and developed systems to diagnose heart disease. Predicting cardiovascular disease requires a brief medical history of the patient, including genetic information. A system for predicting heart disease is acutely needed worldwide. Data mining and machine learning are common techniques used in health care to process large and complex data. This research paper presents causes of heart disease and a prediction model based on machine learning algorithms

    Detection of Dengue Disease Empowered with Fused Machine Learning

    Dengue fever is a life-threatening illness that affects both industrialized and poor nations, including Pakistan. It is necessary to forecast the illness at an early stage to avoid it. Machine Learning (ML) methods outperform other computational approaches in illness prediction. This study predicts dengue fever with a fused machine learning model. Artificial Neural Networks (ANN) and Support Vector Machine (SVM) provide the foundation of the conceptual framework. The dataset used by these models was collected from a government hospital in Lahore, Pakistan, for diagnosing dengue fever (positive or negative). 70% of the dataset is used for training and 30% for testing. The fused model's membership functions determine whether a dengue diagnosis is positive or negative, which controls the model's output. A cloud storage system saves the fused model based on patients' real-time information for future use. The proposed model has a 96.19% accuracy rate, which is considerably higher than that reported in earlier research

    Activity Based Easy Learning Of PushDown Automata

    Teaching is a skill, and it is important to be aware of, or to enquire into, how to exercise it. Teachers must have a full grip on the art of teaching. They must understand students' psychology and the mental and emotional factors that distract them in order to deal with every situation accordingly. Learning difficulty is not a major hardship and can be overcome by developing students' interest and motivating them through updated tools and methodologies such as Thoth, SELFA, FLUTE, JFLAP, FLAP, Java Finite Automata, and Deus Ex Machina (DEM), together with homework exercises for practicing by hand, workshops, oral assessments, quizzes, group sharing, group assessment, clustering, and feedback, which are explained in the article. This article presents different ways of teaching pushdown automata (PDA) that make the students' learning process easier. Furthermore, it clarifies the conceptual model of the PDA and enhances the ability to design PDAs conveniently